Mini Project - Dermatologist - AI

Most of the images above look alike. So how do we tell which is which? Do you need to consult a doctor immediately?

In this mini project, I will design an algorithm that can visually diagnose melanoma, the deadliest form of skin cancer. In particular, my algorithm will distinguish this malignant skin tumor from two types of benign lesions (nevi and seborrheic keratoses).

The data and objective are pulled from the __2017 ISIC Challenge on Skin Lesion Analysis Towards Melanoma Detection__. As part of the challenge, participants were tasked with designing an algorithm to diagnose skin lesion images as one of three different skin diseases (melanoma, nevus, or seborrheic keratosis). In this project, I will build a model and use it to generate my own predictions.

Could AI do a better job than doctors?

There are characteristics that doctors use when identifying cancerous skin lesions, e.g. the fuzziness of the border, the asymmetry, the coloration, and the growth rate if it is known.

Undoubtedly, a trained human eye can be rather good at finding skin spots that may be cancerous, but there are times when even the best doctors miss prominent cases. Hence, this project applies an artificial intelligence (AI) algorithm to detecting skin diseases.

What that means is that we can now use our own computer to take in full-body photos of a patient's skin and send the high-resolution photos through a neural network model (AI) to identify the type of skin disease and the risk of any skin spot being malignant.

To ensure we get a good diagnosis from a skin image, we need to ensure our AI model has a sound accuracy rate.

How can AI help us?

Background

__Melanoma__


Note, this is also known as: Skin cancer, malignant melanoma.

What is melanoma?
Melanoma is a type of skin cancer which usually occurs on the parts of the body that have been overexposed to the sun. Rare melanomas can occur inside the eye or in parts of the skin or body that have never been exposed to the sun.

Melanoma is projected to be the third most common cancer diagnosed in Australia in 2020, which along with New Zealand has the world's highest incidence rate for melanoma. Melanoma is more commonly diagnosed in men than women. The risk of being diagnosed with melanoma by age 85 is 1 in 13 for men compared to 1 in 21 for women.

It is estimated that 16,221 new cases of melanoma will be diagnosed in Australia in 2020.

Melanoma symptoms
Melanoma often has no symptoms. However, the first sign is generally a change in an existing mole or the appearance of a new spot. These changes can include:

New moles and spots will appear and change during childhood, adolescence and during pregnancy and this is normal. However, adults who develop new spots or moles should have them examined by their doctor.

Causes of melanoma
Melanoma risk increases with exposure to UV radiation from the sun or other sources such as solariums, particularly with episodes of sunburn (especially during childhood).

Melanoma risk is increased for people who have:

Diagnosis of melanoma
Melanoma can vary in the way it looks. The first sign is usually a new spot or change in an existing mole.

Physical examination
If you do notice any changes to your skin, your doctor will examine you and carefully check any spots you have identified as changed. Your doctor will use a handheld magnifying instrument (dermatoscope) and consider the criteria known as “ABCDE”. Further tests may be carried out by your GP or you may be referred to a specialist (dermatologist).

Biopsy
If the doctor suspects that a spot on your skin could be melanoma, an excision biopsy is carried out with the removal of the whole spot. This will then be examined under a microscope by a specialist to see if there are any cancer cells.

Checking lymph nodes
Your doctor may feel the lymph nodes near the melanoma to see if they are enlarged as melanoma can sometimes travel via the lymph vessels to other parts of your body. Your doctor may also recommend a biopsy to take a sample of the cells from an enlarged lymph node for further examination under a microscope.

If the doctor suspects melanoma, a biopsy may be carried out. This may be done by your GP or you may be referred to another specialist.

After a diagnosis of melanoma
After being diagnosed with melanoma, you may feel shocked, upset, anxious or confused. These are normal responses. A diagnosis of melanoma affects each person differently. For most it will be a difficult time, however some people manage to continue with their normal daily activities.

__Nevus__


What is a nevus?
The word nevus indicates a benign (non-cancerous) skin or mucosal lesion comprising an abnormal mixture of a tissue’s normal components, usually presenting at birth or at a young age. Nevi are congenital (present at birth) or acquired growths or pigmented blemishes on the skin, such as birthmarks or moles. For example, a mole is a melanocytic nevus. A nevus may also form from other skin cells (e.g., vascular nevi, which are formed from blood vessels). Some of these are also congenital.

Nevus symptoms
Intradermal nevi appear as flesh-colored bumps on the surface of the skin, though they can also appear slightly brown. In some cases, they’ll contain brown spots of small dilated blood vessels.

Causes of nevus
An intradermal nevus is the result of one of three causes:

Diagnosis of nevus
Unless your mole has recently changed in size, shape, or color, no treatment is necessary for an intradermal nevus. However, it is possible to remove the mole if that’s what you’d like.

__Seborrheic Keratoses__


This is also known as SK.

What is a seborrheic keratosis?
A seborrheic keratosis (seb-o-REE-ik ker-uh-TOE-sis) is a common noncancerous skin growth. People tend to get more of them as they get older. Seborrheic keratoses are usually brown, black or light tan. The growths look waxy, scaly and slightly raised. They usually appear on the head, neck, chest or back. Seborrheic keratoses are harmless and not contagious.

Seborrheic Keratoses symptoms
A seborrheic keratosis usually looks like a waxy or wartlike growth. It typically appears on the face, chest, shoulders or back. You may develop a single growth, though multiple growths are more common.
A seborrheic keratosis:

Causes of seborrheic keratoses
Doctors don't know exactly what causes seborrheic keratoses. The growths tend to run in some families, so genes may play a role.

Diagnosis of seborrheic keratoses
Diagnosis can be made by physical examination or skin biopsy.

Standard Boiler Plates | Tools & Scikit Learn Models

Before we get started, let's bring in all the required data tools and models for this project. Under normal circumstances, I like to break this setup into two sub-parts:

Exploratory Data Analysis

Now, let's check a few things before we get to the real work.

Step 1 | Ensure all images are stored in an existing folder

Step 2 | Check and verify the number of images in each folder.
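
The per-folder check can be sketched roughly as below. This is a minimal sketch, not the notebook's original code: the helper name `count_images`, the list of image suffixes, and the `<root>/<split>/<class>/` layout are assumptions (only paths of the form `data/test/<class>/...` appear in this write-up). The demo builds a small temporary folder tree so the snippet runs anywhere.

```python
import tempfile
from pathlib import Path

# Assumption: these suffixes cover the image files in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(root):
    """Count image files in every <root>/<split>/<class>/ folder."""
    root = Path(root)
    counts = {}
    for split_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            n_images = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
            counts[f"{split_dir.name}/{class_dir.name}"] = n_images
    return counts


# Demo on a throwaway folder tree (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_files in [("melanoma", 2), ("nevus", 3)]:
        class_dir = Path(tmp) / "test" / class_name
        class_dir.mkdir(parents=True)
        for i in range(n_files):
            (class_dir / f"example_{i}.jpg").touch()
    demo_counts = count_images(tmp)

for folder, n_images in demo_counts.items():
    print(f"{folder}: {n_images} images")
```

On the real dataset, the same helper would be called on the data root, e.g. `count_images("data")`, assuming that is where the images live.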

The breakdown on each folder is as follows:

Step 3 | Review a few images randomly to ensure photos/pictures are accessible

Specify DataLoaders & Transform Dataset

Data loading is one of the first steps in building a Deep Learning pipeline, or training a model. This task becomes more challenging when the complexity of the data increases.

In this section, we will learn about the DataLoader class in PyTorch that helps us to load and iterate over elements in a dataset. This class is available as DataLoader in the torch.utils.data module.
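
As a rough illustration of how `DataLoader` iterates over a dataset in batches, here is a minimal sketch. It uses random tensors in a `TensorDataset` as a stand-in for the real skin-lesion images; the tensor sizes, the batch size of 4, and the three label values are assumptions chosen only to keep the example small.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Stand-in data: 10 fake RGB "images" of 32x32 pixels with labels 0, 1 or 2.
images = torch.randn(10, 3, 32, 32)
labels = torch.randint(0, 3, (10,))
dataset = TensorDataset(images, labels)

# shuffle=True reshuffles the samples at the start of every epoch.
loader = DataLoader(dataset, batch_size=4, shuffle=True)

batch_shapes = []
for batch_images, batch_labels in loader:
    batch_shapes.append((tuple(batch_images.shape), tuple(batch_labels.shape)))
    print(tuple(batch_images.shape), tuple(batch_labels.shape))
```

With 10 samples and a batch size of 4, the loader yields two full batches and a final batch of 2. In the actual project the dataset would be built from the image folders rather than from random tensors.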

Note:

Batch size is one of the most important hyperparameters to tune in deep learning. I prefer to use a larger batch size to train my models, as it allows computational speedups from GPU parallelism. However, it is well known that too large a batch size can lead to poor generalization. At one extreme, using a batch equal to the entire dataset gives an exact gradient at every step and, for convex objectives with a suitable learning rate, guarantees convergence to the global optimum. However, this comes at the cost of slower convergence to that optimum. At the other extreme, smaller batch sizes have been shown to converge faster to good results. Intuitively, smaller batches allow the model to start learning before it has seen all the data. The downside of a smaller batch size is that the noisier updates are not guaranteed to converge to the global optimum. It is therefore often advised to start with a small batch size, reaping the benefits of faster training dynamics, and steadily grow the batch size during training.
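
To make the trade-off concrete, the sketch below counts how many parameter updates one pass over the data (one epoch) produces for different batch sizes. The training-set size of 2000 is a hypothetical number used only for illustration.

```python
import math


def updates_per_epoch(num_samples, batch_size):
    """Number of parameter updates in one pass over the data."""
    return math.ceil(num_samples / batch_size)


num_samples = 2000  # hypothetical training-set size
updates = {bs: updates_per_epoch(num_samples, bs) for bs in (1, 20, 200, 2000)}

for batch_size, n_updates in updates.items():
    print(f"batch size {batch_size:>4}: {n_updates} updates per epoch")
```

A batch size of 1 gives 2000 (noisy) updates per epoch, while a batch equal to the whole dataset gives a single (exact) update per epoch.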

Implement CNN Model using Transfer Learning technique

CPU vs GPU-enabled

To simplify the model architecture, I am using transfer learning from a pre-trained model to create a CNN that can identify the type of skin lesion from images. As we are dealing with high-resolution images, processing is likely to take days if we run this on a local computer. As such, it is highly recommended to use a GPU.

Training neural networks on a capable GPU is particularly helpful in cutting down training time, so being able to use one is close to a necessity. The following code checks whether a GPU is available for training the CNN model. If one is not available, using one is highly recommended.
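
A GPU check along these lines is a common PyTorch pattern; this is a minimal sketch rather than the notebook's original cell. The variable names and the dummy tensor are illustrative assumptions.

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Training on: {device}")

# Tensors (and models) are moved to the chosen device with .to(device).
dummy_batch = torch.zeros(1, 3, 8, 8).to(device)
print(f"Dummy batch lives on: {dummy_batch.device}")
```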

Why do we need GPU?

GPU stands for Graphics Processing Unit. It was originally designed to accelerate the rendering of 3D graphics. Over time, GPUs became more flexible and programmable, enhancing their capabilities. Many graphics programmers leverage this to create more interesting visual effects and realistic scenes with advanced lighting and shadowing techniques. In our specific case, we tap into the power of GPUs to dramatically accelerate additional workloads in high performance computing (HPC) and deep learning, like our model below.

Caution: Since I do not have a GPU available, my code will take days to complete.

There is a lot of research and active work happening to think of ways to accelerate cloud computing.

Transfer Learning Technique

What exactly is Transfer Learning and why do we need to use this technique?

Our task is to train a convolutional neural network (CNN) that can identify objects in images. If too few images are available, the network cannot learn enough to reach high accuracy. Therefore, instead of building and training a CNN from scratch, we’ll use a pre-built, pre-trained model and apply transfer learning. Transfer learning is one of the most popular approaches in deep learning.

The basic premise of transfer learning is simple: take a model trained on a large dataset and transfer its knowledge to a smaller dataset. The idea is that the early convolutional layers extract general, low-level features that are applicable across images (such as edges, patterns and gradients), while the later layers identify specific features within an image, such as eyes or wheels.

Thus, we can use a network trained on unrelated categories in a massive dataset (usually Imagenet) and apply it to our own problem because there are universal, low-level features shared between images.
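
The mechanics of this idea can be sketched as follows: freeze the feature-extracting layers, then replace the final classification layer with one sized for the three lesion classes. To keep the example light, a tiny hypothetical network (`TinyBackboneNet`) stands in for a real pre-trained model; its layer sizes are assumptions, and it is not pre-trained. Only the freeze-and-replace pattern is the point here.

```python
import torch
from torch import nn


class TinyBackboneNet(nn.Module):
    """Hypothetical stand-in for a pre-trained network (features + classifier)."""

    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(8 * 4 * 4, 32),
            nn.ReLU(),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyBackboneNet()

# Step 1: freeze the feature extractor so its weights are not updated.
for param in model.features.parameters():
    param.requires_grad = False

# Step 2: swap the last layer for a new one with 3 outputs
# (melanoma, nevus, seborrheic keratosis).
in_features = model.classifier[-1].in_features
model.classifier[-1] = nn.Linear(in_features, 3)

trainable_names = [n for n, p in model.named_parameters() if p.requires_grad]
output = model(torch.randn(2, 3, 32, 32))
print(trainable_names)
print(tuple(output.shape))
```

After these two steps only the classifier parameters remain trainable, and the network produces one score per lesion class for each input image.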

A convolutional neural network (CNN), also known as a ConvNet, is a specific type of artificial neural network, trained with supervised learning, that uses convolutional layers to analyze data. CNNs are applied to image processing, natural language processing and other kinds of cognitive tasks.

Add layer(s) to pre-trained VGG19Net Model

What is a pre-trained model?

Under normal circumstances, we use a pre-trained model that was built to solve a similar machine learning problem. Instead of building a model from scratch, we use the model trained on the other problem as a starting point.

When choosing a pre-trained model, one should be careful. If the problem at hand is very different from the one on which the pre-trained model was trained, the predictions we get will likely be very inaccurate.

In simple terms, VGG is a deep CNN used to classify images. The layers in the VGG19 model are as follows:

Architecture

In summary, VGG19Net contains 16 convolutional layers, 5 max-pooling layers and 3 fully connected (FC) layers. The 16 convolutional layers and 3 FC layers are the layers with weights, which makes up the 19 layers in VGG19Net.
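
As a cross-check of the layer count, the standard VGG19 layout can be tallied in a few lines. The list below follows the usual way of writing the VGG19 configuration (numbers are convolution output channels, "M" is a max-pooling layer); the variable names are illustrative.

```python
# Standard VGG19 feature-extractor layout:
# integers = conv layer output channels, "M" = max-pooling layer.
VGG19_CFG = [
    64, 64, "M",
    128, 128, "M",
    256, 256, 256, 256, "M",
    512, 512, 512, 512, "M",
    512, 512, 512, 512, "M",
]
NUM_FC_LAYERS = 3  # fully connected layers in the classifier

num_conv = sum(1 for v in VGG19_CFG if v != "M")
num_pool = VGG19_CFG.count("M")
num_weight_layers = num_conv + NUM_FC_LAYERS

print(f"conv layers: {num_conv}")                  # 16
print(f"max-pool layers: {num_pool}")              # 5
print(f"layers with weights: {num_weight_layers}") # 19
```

The "19" in the name counts only the layers with learnable weights (convolutional and fully connected); pooling layers are not included.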

Specify Loss Function and Optimizer

In most learning networks, the error is calculated as the difference between the actual output y and the predicted output ŷ. The function used to compute this error is known as the loss function, also called the cost function.

In my model, I am using the Cross-Entropy Loss (or Log Loss) function, as this is a categorical (multi-class) classification problem.

Cross-entropy measures the divergence between two probability distributions. If the cross-entropy is large, the difference between the two distributions is large; if it is small, the two distributions are similar to each other.

Note: P is the distribution of the true labels, and Q is the probability distribution of the predictions from the model.
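
With P and Q defined as in the note, the cross-entropy is commonly written as H(P, Q) = -Σ P(x) log Q(x), summed over the classes x. The sketch below computes it by hand for a single image. The class order (melanoma, nevus, seborrheic keratosis) and the example probabilities are assumptions made for illustration.

```python
import math


def cross_entropy(p, q, eps=1e-12):
    """H(P, Q) = -sum_x P(x) * log(Q(x)); eps guards against log(0)."""
    return -sum(pi * math.log(max(qi, eps)) for pi, qi in zip(p, q))


# True label: melanoma, written as a one-hot distribution P.
true_label = [1.0, 0.0, 0.0]

# Two hypothetical model outputs Q.
confident = [0.9, 0.05, 0.05]
unsure = [0.4, 0.3, 0.3]

loss_confident = cross_entropy(true_label, confident)  # -ln(0.9), about 0.105
loss_unsure = cross_entropy(true_label, unsure)        # -ln(0.4), about 0.916

print(f"confident prediction: {loss_confident:.3f}")
print(f"unsure prediction:    {loss_unsure:.3f}")
```

The prediction that puts more probability on the true class gets the smaller loss, which is the behaviour the optimizer then exploits.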

The optimizer, in turn, is used to minimize the prediction error or loss.

While processing the examples in the training set, the model updates its parameters (the weights W). The error plotted against W is called the cost function plot J(W), since it determines the cost/penalty of the model. Minimizing the error is therefore also called minimizing the cost function.

But how exactly do you do that? Using optimizers. Optimizers are used to update weights and biases i.e. the internal parameters of a model to reduce the error. The most important technique and the foundation of how we train and optimize our model is using Gradient Descent.

In my case, I am using stochastic gradient descent (SGD). In its strict form, SGD updates the parameters using only a single, usually randomly selected, training instance in each iteration; in practice, a small random mini-batch is commonly used instead. SGD is often preferred for optimizing cost functions when there are hundreds of thousands of training instances or more, as it typically converges more quickly than batch gradient descent.
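
The single-instance update rule can be illustrated on a toy problem. The sketch below fits the slope w of a line y = w * x with a squared-error loss, updating w from one randomly chosen example per step. The data, learning rate and step count are assumptions chosen for illustration; this is not the project's training code.

```python
import random


def sgd_fit_slope(xs, ys, learning_rate=0.1, num_steps=500, seed=0):
    """Fit y = w * x with SGD, using one random training instance per step."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(num_steps):
        i = rng.randrange(len(xs))     # pick a single training instance
        error = w * xs[i] - ys[i]      # prediction error on that instance
        gradient = 2.0 * error * xs[i] # d/dw of (w * x - y)^2
        w -= learning_rate * gradient  # gradient descent update
    return w


# Toy data generated from the true slope w = 3 (no noise).
xs = [0.2, 0.4, 0.6, 0.8, 1.0]
ys = [3.0 * x for x in xs]

learned_w = sgd_fit_slope(xs, ys)
print(f"learned slope: {learned_w:.4f}")
```

Each step looks at only one example, yet the estimate moves steadily toward the true slope because every example agrees on the direction of the correction.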

Train the Model

So far, we have loaded the images, converted them to tensors, modified our pre-trained VGG19Net model, and specified the loss function and optimizer to be applied. Next, we are ready to train the model.

Test on the modified CNN Model

As mentioned earlier, it took 2-3 days to fully train the model above. Now that training is complete, let's run the model on some independent test images and gauge its accuracy rate.
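
The accuracy rate is simply the fraction of test images whose predicted class matches the true class. A minimal sketch, with made-up labels for illustration only:

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels, not results from this project.
actual = ["melanoma", "nevus", "nevus", "seborrheic_keratosis", "melanoma"]
predicted = ["melanoma", "nevus", "melanoma", "seborrheic_keratosis", "nevus"]

demo_accuracy = accuracy(predicted, actual)
print(f"accuracy: {demo_accuracy:.0%}")
```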

We reached an accuracy of 66%. But how can we improve further? As mentioned earlier, I am using a CPU to run this neural network, which means I am constrained by computational resources. One way to improve the model is to increase the number of training epochs (num_epochs) to 250, which may improve the accuracy further.

Conclusion

Hurray, we have come this far! Our model has a 66% accuracy rate. This may be roughly comparable to how a dermatologist would assess a given case, although this project does not measure that directly.

Test Inference

Now that the model has been trained and tested on independent images, let's take a specific image and see how we can use the model to make a prediction for it.

        **image.resize((550,550), resample=Image.BICUBIC)**

The resize function used here has an optional resample parameter that sets the resampling filter.
It can be NEAREST, BOX, BILINEAR, HAMMING, BICUBIC or LANCZOS. These are similar to the options you might see in imaging applications like Photoshop or GIMP.
The list of filters is ordered from lowest to highest quality, but the higher quality options tend to be a bit slower. BICUBIC is probably the best option for general use (older Pillow releases defaulted to NEAREST, while Pillow 7.0 and later default to BICUBIC for most image modes; either way, a modern computer can run BICUBIC pretty quickly and the quality improvement over NEAREST is quite noticeable).

resize also has an optional box parameter that specifies a region to be resized. The image is first cropped to the box, and the result is then resized. The crop box is specified as a 4-tuple, in exactly the same way as for the crop function below.
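
A minimal sketch of both options with Pillow is shown below. A blank in-memory image stands in for a real lesion photo, and the 800x600 size and the crop box coordinates are arbitrary assumptions.

```python
from PIL import Image

# Stand-in for a real photo: a blank 800x600 RGB image.
image = Image.new("RGB", (800, 600), color=(180, 120, 90))

# Resize the whole image to 550x550 with the BICUBIC filter.
resized = image.resize((550, 550), resample=Image.BICUBIC)

# Resize only a region: box is (left, upper, right, lower) in pixels.
crop_box = (100, 50, 700, 550)
resized_region = image.resize((550, 550), resample=Image.BICUBIC, box=crop_box)

print(resized.size, resized_region.size)
```

Both calls return a new 550x550 image; the second one is built only from the pixels inside the box.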

Getting your Results

Once we have trained and tested our model, we create a CSV file to store our test predictions. Note that the file should have exactly 600 rows, each corresponding to a different test image, plus a header row.

The CSV file should have exactly 3 columns.

Samples of output should look like this:
Id,task_1,task_2
data/test/melanoma/ISIC_0013242.jpg,0,3.88E-33
data/test/melanoma/ISIC_0013321.jpg,0,2.41E-12
data/test/melanoma/ISIC_0013411.jpg,0,1.62E-31
data/test/nevus/ISIC_0016029.jpg,0,0
data/test/nevus/ISIC_0016030.jpg,7.62E-32,8.81E-28
data/test/nevus/ISIC_0016031.jpg,1.10E-14,7.91E-16
data/test/seborrheic_keratosis/ISIC_0012848.jpg,0,0
data/test/seborrheic_keratosis/ISIC_0012522.jpg,0,5.84E-33
data/test/seborrheic_keratosis/ISIC_0014386.jpg,0,1.65E-25
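
Writing a file in this shape can be sketched with Python's `csv` module. The helper name `write_predictions`, the example paths and the scores below are hypothetical; the column names follow the sample above. An in-memory buffer is used so the snippet runs without touching disk.

```python
import csv
import io


def write_predictions(rows, out_file):
    """Write (image_path, score_1, score_2) rows under the expected header."""
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["Id", "task_1", "task_2"])
    for image_path, score_1, score_2 in rows:
        writer.writerow([image_path, f"{score_1:.6g}", f"{score_2:.6g}"])
    return len(rows)


# Hypothetical predictions for two test images.
example_rows = [
    ("data/test/melanoma/example_0001.jpg", 0.25, 0.7),
    ("data/test/nevus/example_0002.jpg", 0.5, 0.125),
]

buffer = io.StringIO()
n_written = write_predictions(example_rows, buffer)
csv_lines = buffer.getvalue().splitlines()
print("\n".join(csv_lines))
```

For a real submission, `out_file` would be a file opened with `open(path, "w", newline="")`, and the row count returned by the helper could be compared against the expected 600.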

Before we compute the probabilities, let's make sure there are 600 images in the test data folder.